21 research outputs found

Data Optimization in Deep Learning: A Survey

Author: Wu Ou
Yao Rujing
Publication date: 25/10/2023

Large-scale, high-quality data are considered an essential factor for the successful application of many deep learning techniques. Meanwhile, numerous real-world deep learning tasks still have to contend with the lack of sufficient amounts of high-quality data. Additionally, issues such as model robustness, fairness, and trustworthiness are also closely related to training data. Consequently, a huge number of studies in the existing literature have focused on the data aspect in deep learning tasks. Some typical data optimization techniques include data augmentation, logit perturbation, sample weighting, and data condensation. These techniques usually come from different deep learning divisions, and their theoretical inspirations or heuristic motivations may seem unrelated to each other. This study aims to organize a wide range of existing data optimization methodologies for deep learning from the previous literature, and attempts to construct a comprehensive taxonomy for them. The constructed taxonomy considers the diversity of split dimensions, and deep sub-taxonomies are constructed for each dimension. On the basis of the taxonomy, connections among the extensive data optimization methods for deep learning are built in terms of four aspects. We also discuss several promising and interesting future directions. The constructed taxonomy and the revealed connections should aid the understanding of existing methods and the design of novel data optimization techniques. Furthermore, our aspiration for this survey is to promote data optimization as an independent subdivision of deep learning. A curated, up-to-date list of resources related to data optimization in deep learning is available at https://github.com/YaoRujing/Data-Optimization.

arXiv.org e-Print Archive

Codebook Configuration for 1-bit RIS-aided Systems Based on Implicit Neural Representations

Author: Fan Zhijie
Ling Zenan
Mi Tiebin
Qiu Robert Caiming
Xiao Yao
Xiong Rujing
Publication date: 01/06/2023

Reconfigurable intelligent surfaces (RISs) have become one of the key technologies in 6G wireless communications. By configuring the reflection beamforming codebooks, an RIS focuses signals on target receivers. In this paper, we investigate the codebook configuration for 1-bit RIS-aided systems. We propose a novel learning-based method built upon the advanced methodology of implicit neural representations. The proposed model learns a continuous and differentiable coordinate-to-codebook representation from samplings. Our method only requires the information of the user's coordinate and avoids the assumption of channel models. Moreover, we propose an encoding-decoding strategy to reduce the dimension of codebooks, and thus improve the learning efficiency of the proposed method. Experimental results on simulation and measured data demonstrate the remarkable advantages of the proposed method.

arXiv.org e-Print Archive

Precursor apportionment of atmospheric oxygenated organic molecules using a machine learning method

Author: Bianchi Federico
Donahue Neil M.
Guo Yishuo
Huang Dandan
Jiang Jingkun
Kulmala Markku
Li Xiaoxiao
Liu Yongchun
Nie Wei
Qiao Xiaohui
Sarnela Nina
Wang Zhe
Yan Chao
Yao Lei
Yin Rujing
Publication date: 03/12/2022

Publisher Copyright: © 2023 The Author(s). Published by the Royal Society of Chemistry. Gas-phase oxygenated organic molecules (OOMs) can contribute significantly to both atmospheric new particle growth and secondary organic aerosol formation. Precursor apportionment of atmospheric OOMs connects them with volatile organic compounds (VOCs). Since atmospheric OOMs are often highly functionalized products of multistep reactions, it is challenging to reveal the complete mapping relationships between OOMs and their precursors. In this study, we demonstrate that the machine learning method is useful in attributing atmospheric OOMs to their precursors using several chemical indicators, such as O/C ratio and H/C ratio. The model is trained and tested using data acquired in controlled laboratory experiments, covering the oxidation products of four main types of VOCs (isoprene, monoterpenes, aliphatics, and aromatics). Then, the model is used for analyzing atmospheric OOMs measured in both urban Beijing and a boreal forest environment in southern Finland. The results suggest that atmospheric OOMs in these two environments can be reasonably assigned to their precursors. Beijing is an anthropogenic VOC dominated environment with ~64% aromatic and aliphatic OOMs, and the other boreal forested area has ~76% monoterpene OOMs. This pilot study shows that machine learning can be a promising tool in atmospheric chemistry for connecting the dots. Peer reviewed.

Helsingin yliopiston digitaalinen arkisto

Contribution of Atmospheric Oxygenated Organic Compounds to Particle Growth in an Urban Environment

Author: Bianchi Federico
Cai Runlong
Deng Chenjuan
Donahue Neil M.
Guo YiShuo
Huang Dandan
Jiang Jingkun
Kulmala Markku
Li Chang
Li Xiaoxiao
Liu Yongchun
Nie Wei
Qiao Xiaohui
Wang Mingyi
Wang Zhe
Worsnop Douglas R.
Yan Chao
Yao Lei
Yin Rujing
Publication date: 19/10/2021

Gas-phase oxygenated organic molecules (OOMs) can contribute substantially to the growth of newly formed particles. However, the characteristics of OOMs and their contributions to particle growth rate are not well understood in urban areas, which have complex anthropogenic emissions and atmospheric conditions. We performed long-term measurement of gas-phase OOMs in urban Beijing during 2018-2019 using nitrate-based chemical ionization mass spectrometry. OOM concentrations showed clear seasonal variations, with the highest in the summer and the lowest in the winter. Correspondingly, calculated particle growth rates due to OOM condensation were highest in summer, followed by spring, autumn, and winter. One prominent feature of OOMs in this urban environment was a high fraction (~75%) of nitrogen-containing OOMs. These nitrogen-containing OOMs contributed only 50-60% of the total growth rate led by OOM condensation, owing to their slightly higher volatility than non-nitrate OOMs. By comparing the calculated condensation growth rates and the observed particle growth rates, we showed that sulfuric acid and its clusters are the main contributors to the growth of sub-3 nm particles, with OOMs significantly promoting the growth of 3-25 nm particles. In wintertime Beijing, however, there are missing contributors to the growth of particles above 3 nm, which remain to be further investigated. Peer reviewed.

Helsingin yliopiston digitaalinen arkisto

Particle growth with photochemical age from new particle formation to haze in the winter of Beijing, China

Author: Bianchi Federico
Cai Jing
Chen Xuemeng
Chu Biwu
Dada Lubna
Deng Chenjuan
Du Wei
Dällenbach K.R.
Feng Zeming
Fu Yueyun
He Hong
He Xucheng
Jiang Jingkun
Kangasluoma Juha
Kerminen Veli-Matti
Kujansuu Joni
Kulmala Markku
Li Haiyan
Liu Yongchun
Petäjä Tuukka
Simonen Pauli
Wang Yonghong
Yan Chao
Yao Lei
Yin Rujing
Zhou Ying
Publication date: 20/01/2021

Secondary aerosol formation in the aging process of primary emission is the main reason for haze pollution in eastern China. Pollution evolution with photochemical age was studied for the first time at a comprehensive field observation station during winter in Beijing. The photochemical age was used as an estimate of the time scale attributed to the aging process and was estimated from the ratio of toluene to benzene in this study. A low photochemical age indicates a fresh emission. The photochemical age of air masses during new particle formation (NPF) days was lower than that on haze days. In general, the strongest NPF events, along with a peak of the formation rate of 1.5 nm (J_1.5) and 3 nm (J_3) particles, were observed when the photochemical age was between 12 and 24 h, and they rarely took place with photochemical ages less than 12 h. When photochemical age was larger than 48 h, haze occurred and NPF was suppressed. The sources and sinks of nanoparticles had distinct relations with the photochemical age. Our results show that the condensation sink (CS) showed a valley with photochemical ages ranging from 12 to 24 h, while H2SO4 concentration showed no obvious trend with the photochemical age. The high concentrations of precursor vapours within an air mass lead to persistent nucleation with photochemical age ranging from 12 to 48 h in winter. Coincidentally, the fast increase of PM2.5 mass was also observed during this range of photochemical age. Notably, CS increased with the photochemical age on NPF days only, which is the likely reason for the observation that the PM2.5 mass increased faster with photochemical age on NPF days compared with other days. The evolution of particles with the photochemical age provides new insights into understanding how particles originating from NPF transform to haze pollution. © 2020 Elsevier B.V. All rights reserved. Peer reviewed.

Helsingin yliopiston digitaalinen arkisto

Trepo - Institutional Repository of Tampere University

Sources and sinks driving sulfuric acid concentrations in contrasting environments: implications on proxy calculations

Sulfuric acid has been shown to be a key driver for new particle formation and subsequent growth in various environments, mainly due to its low volatility. However, direct measurements of gas-phase sulfuric acid are oftentimes not available, and the current sulfuric acid proxies cannot predict, for example, its nighttime concentrations or result in significant discrepancies with measured values. Here, we define the sources and sinks of sulfuric acid in different environments and derive a new physical proxy for sulfuric acid to be utilized in locations and during periods when it is not measured. We used H2SO4 measurements from four different locations: Hyytiälä, Finland; Agia Marina, Cyprus; Budapest, Hungary; and Beijing, China, representing semi-pristine boreal forest, rural environment in the Mediterranean area, urban environment and heavily polluted megacity, respectively. The new proxy takes into account the formation of sulfuric acid from SO2 via OH oxidation and other oxidation pathways, specifically via stabilized Criegee intermediates. The sulfuric acid sinks included in the proxy are its condensation sink (CS) and atmospheric clustering starting from H2SO4 dimer formation. Indeed, we found that the observed sulfuric acid concentration can be explained by the proposed sources and sinks with similar coefficients in the four contrasting environments where we have tested it. Thus, the new proxy is more flexible and an important improvement over previous proxies. Following the recommendations in this paper, a proxy for a specific location can be derived. Peer reviewed.

Helsingin yliopiston digitaalinen arkisto